Dynamic Data Masking – Part 2–Security and Incorporating into Code

Comments 0

Share to social media

This is part 2 of my Dynamic Data Masking blogs for my upcoming book project. Part 1 is here: Dynamic Data Masking – Part 1 – The Mechanism.

Previously, in my first blog on Dynamic Data Masking – Part 1 – The Mechanisms, we had created the following table (I include this so you can see the DDL of creating the table with the MASKED WITH clause on the datatype. It interestingly comes before the NULL specification):

CREATE TABLE Demo.Person(
    PersonId int NOT NULL,
    FirstName nvarchar(10) NULL,
    LastName nvarchar(10) NULL,
    PersonNumber varchar(10) MASKED WITH (FUNCTION = ‘partial(1, "——-", 2)’) NOT NULL,
    StatusCode varchar(10) MASKED WITH (FUNCTION = ‘partial(0, "Unknown", 0)’) NULL
        CONSTRAINT CHKPersonStatus CHECK  ((StatusCode=’New’ OR StatusCode=’Inactive’ OR StatusCode=’Active’))
        CONSTRAINT DFLTPersonStatus  DEFAULT (‘New’) ,
    EmailAddress nvarchar(40) MASKED WITH (FUNCTION = ’email()’) NULL,
    InceptionTime date MASKED WITH (FUNCTION = ‘default()’) NOT NULL,
    YachtCount tinyint MASKED WITH (FUNCTION = ‘random(1, 100)’) NOT NULL
         CONSTRAINT DFLTPersonYachtCount  DEFAULT ((0))
         CONSTRAINT CHKPersonYachtCount CHECK  ((YachtCount>=(0))),
CONSTRAINT PKPerson PRIMARY KEY CLUSTERED
    (
        PersonId
    )
);

And we have two users (roles in a typical real setup!) that we want to give differing access to data:

    1. Those who can see all data (Test User: "BigHat"
    2. Those who can only see the first and last name of the person. Test User: ("SmallHat")

CREATE USER BigHat without login;
CREATE USER SmallHat without login;
GRANT SELECT on Demo.Person to BigHat;
GRANT SELECT on Demo.Person to SmallHat;

As it stands, the data raw looks like this:

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 0000000014   Active     fred@flintstone@slatequarry.net          1959-01-01    0
2           Barney     Rubble     0000000032   Active     barneyrubble@aol.com                     1960-08-01    1
3           Wilma      Flintstone 0000000102   Active     NULL                                     1959-01-01    1

And for the 2 users that are not sa/dbo but have SELECT rights:

EXECUTE AS User=’SmallHat’
go
SELECT *
FROM   Demo.Person;
go
REVERT;

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 0——-14   Unknown    fXXX@XXXX.com                            1900-01-01    23
2           Barney     Rubble     0——-32   Unknown    bXXX@XXXX.com                            1900-01-01    15
3           Wilma      Flintstone 0——-02   Unknown    NULL                                     1900-01-01    15

So let’s look at the different users. SmallHat user is fine as is. They were the target configuration, and the original masking goal has been met. Next, let’s go for BigHat. The goal is to let BigHat see all of the data in this table. To do this, that is a privilege you can grant called UNMASK.

First, let’s check the data for BigHat:

GRANT UNMASK ON Demo.Person TO BigHat;

Executing this, you get:

Msg 102, Level 15, State 1, Line 94
Incorrect syntax near ‘UNMASK’.

Which is a weird error messagge because the syntax issue is with ON, but it turns out that UNMASK is a database privilege, not an object one. So, you can grant users rights to see ALL masked data, or ALL unmasked data by database.

GRANT UNMASK TO BigHat;
GO
EXECUTE AS User=’BigHat’;
go
SELECT *
FROM   Demo.Person;
go
REVERT;

This returns:

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 0000000014   Active     fred@flintstone@slatequarry.net          1959-01-01    0
2           Barney     Rubble     0000000032   Active     barneyrubble@aol.com                     1960-08-01    1
3           Wilma      Flintstone 0000000102   Active     NULL                                     1959-01-01    1

But this opens up the user to ALL masked data in the database, not just one object or schema. So this does somewhat limit the value of the feature, since 1, it doesn’t stop dbo users from seeing the data, even casually, and 2, it is all db or none.

Hence, we really cannot do this for a multi-tier purpose at all, so if you have a user with rights between BigHat and SmallHat (call them MedHat, which we will use in the Row Level Security blogs coming next) that needs to mask some data, you may still end up using a view to do some or all of the masking. Column level security would allow us to limit access to some columns but column level security will make the output of different user’s queries take on different shapes to work. Data Masking allows the shape of the output to stay the same so the UI and querying layers can be simple.

While we are on security though, what about other uses in queries. First, what about a WHERE clause. If the data is masked, can you query for it?

EXECUTE AS User=’SmallHat’;
GO
SELECT *
FROM   Demo.Person
WHERE  YachtCount = 0; 
GO
REVERT;

Despite what the outputted masked yachtCount says.:

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 0——-14   Unknown    fXXX@XXXX.com                            1900-01-01    80

That is the row with 0, as we showed in the unmasked query.  Query for YachtCount = 1 and the output will be two rows. Searching for rows where PersonNumber = ‘0000000032’, you will get the rows that match:

EXECUTE AS User=’SmallHat’;
go
SELECT *
FROM   Demo.Person
WHERE PersonNumber = ‘0000000014’;
GO
REVERT;

Which returns:

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 0——-14   Unknown    fXXX@XXXX.com                            1900-01-01    7

The moral here is that need to be careful about how you use this feature. It is not as strict as column level security (or as Row Level Security will turn out to be, which is the next series of blogs to follow), so if a user has ad-hoc access to your db, they could figure out the data with some simple queries.

And what if you aggregate? The next query would be a head scratcher to the average user, especially if they new that PersonNumber was supposed to be a unique value (which probably isn’t you, reading this blog, unless you went to bed at 2AM and at 3AM a support call comes in):

EXECUTE AS User=’SmallHat’;
GO
SELECT SUM(YachtCount) as YachtCount, MAX(PersonNumber) as MaxPersonNumber, 
       MIN(PersonNumber) as PersonNumber, COUNT(*) as MatchingRows
FROM   Demo.Person
WHERE  YachtCount > 0
  AND  PersonNumber <> ‘xxxx’;
GO
REVERT;

YachtCount  MaxPersonNumber PersonNumber MatchingRows
———– ————— ———— ————
0           xxxx            xxxx         2

Note that for the aggregate output, it used the default default value for the masked values, but allowed YachtCount in the WHERE, so 2 rows were returned, though it seems like it shouldn’t be.

Next, what about if you cast the data:

EXECUTE AS User=’SmallHat’;
GO
SELECT PersonId, FirstName, LastName, PersonNumber, cast(PersonNumber as varchar(10)) as CastedPersonNumber,
                                      YachtCount, cast(YachtCount as tinyint) as CastedYachtCount
FROM     Demo.Person
WHERE YachtCount = 1;
GO
REVERT;

PersonId    FirstName  LastName   PersonNumber CastedPersonNumber YachtCount CastedYachtCount
———– ———- ———- ———— —————— ———- —————-
2           Barney     Rubble     0——-32   xxxx               47         0
3           Wilma      Flintstone 0——-02   xxxx               59         0

I did NOT see that one coming… While you don’t see the data, it is masked with the defaults, not the expected answer. In the case of YachtCount, it gives you a couple of doozies of answers that do not make sense with the WHERE clause.

This is CTP3.2, so it may change… if not, these limitations are confusing at best. In earlier CTPs the data was unmasked by the cast, so this may just be another improvment to come. Be sure to test yourself, and I will try to update the blog after RTM as I am using this as material for the book.

Next let’s attempt an UPDATE.  We will need to give the user rights to modify the table:

GRANT Insert, Update, Delete On Demo.Person to SmallHat;

Next, let do an insert, update, and delete to the table as the SmallHat user:

EXECUTE AS User=’SmallHat’;
go

–delete Wilma
DELETE FROM demo.person WHERE PersonNumber = ‘0000000102’;

–add Betty
INSERT INTO Demo.Person (PersonId,FirstName,LastName,PersonNumber, StatusCode, EmailAddress, InceptionTime,YachtCount)
VALUES(4,’Betty’,’Rubble’,’0000000153′,’Active’,’betty.rubble@aol.com’,’8/1/1960′,0);

–update Fred’s person number using only a masked column
UPDATE Demo.Person
SET    PersonNumber = ‘1111111114’
WHERE  PersonNumber = ‘0000000014’;

SELECT *
FROM   Demo.Person;
GO

REVERT;

Looking at the output, you can see that everything worked (Fred’s entry now starts with a 1, where it started with a 0 previously):

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 1——-14   Unknown    fXXX@XXXX.com                            1900-01-01    98
2           Barney     Rubble     0——-32   Unknown    bXXX@XXXX.com                            1900-01-01    32
4           Betty      Rubble     0——-53   Unknown    bXXX@XXXX.com                            1900-01-01    45

So it doesn’t stop any DDL or SELECT statment clauses other than the SELECT clause from working, even ORDER BY:

EXECUTE AS User=’SmallHat’;
GO
SELECT *
FROM   Demo.Person
ORDER  BY YachtCount;
GO
REVERT;

Which you can see is ordered by the YachtCount (if you look at the base data that is, the masked values are a bit confusing.)

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 1——-14   Unknown    fXXX@XXXX.com                            1900-01-01    43
4           Betty      Rubble     0——-53   Unknown    bXXX@XXXX.com                            1900-01-01    65
2           Barney     Rubble     0——-32   Unknown    bXXX@XXXX.com                            1900-01-01    18

This was another thing I didn’t expect. The ORDER BY clause is (was in previous version?) the last to be processed, which allowed you to use aliases from the SELECT in the ORDER BY. But now, the output of the query seems to be being masked later in the process. Take for example, this substring or PersonNumber, first as dbo:

select PersonId, substring(PersonNumber, 2,6) as PersonNumberSubstring
from   Demo.Person

PersonId    PersonNumberSubstring
———– ———————
1           111111
2           000000
4           000000

Now, as SmallHat, after granting rights to look at the plan:

GRANT SHOWPLAN TO SmallHat;
EXECUTE AS User=’SmallHat’;
GO
SELECT PersonId, substring(PersonNumber, 2,6) as PersonNumberSubstring
FROM    Demo.Person
ORDER BY  PersonNumberSubstring;
GO
REVERT;

The data is sorted correctly:

PersonId    PersonNumberSubstring
———– ———————
2           xxxx
4           xxxx
1           xxxx

But the output is masked, and the plan doesn’t show anything about masking:

image

Moving along, remember CTP, not RTM… Next, let’s try a stored procedure and a view. So let’s create a simple stored procedure using the same query as we have been using

CREATE PROCEDURE Demo.Person$select
AS
    SELECT PersonId, FirstName, LastName, PersonNumber, cast(PersonNumber as varchar(10)) as PersonNumber,
                                          YachtCount, cast(YachtCount as tinyint) as YachtCount
    FROM   Demo.Person
    WHERE  YachtCount = 1;
go
GRANT EXECUTE ON Demo.Person$select TO SmallHat;
go
–Plus, we revoke select on the person table, to show they can’t query the data using their security:
REVOKE SELECT ON Demo.Person To SmallHat;

Now we run it as that user:

EXECUTE AS User=’SmallHat’;
GO
EXEC Demo.Person$select;
GO
SELECT *
FROM   Demo.Person;
GO
REVERT;

I admit, I really didn’t see this coming that the data would be masked.

PersonId    FirstName  LastName   PersonNumber PersonNumber YachtCount YachtCount
———– ———- ———- ———— ———— ———- ———-
2           Barney     Rubble     0——-32   xxxx         39         0

And the output to show the user couldn’t access the data directly:

Msg 229, Level 14, State 5, Line 270
The SELECT permission was denied on the object ‘Person’, database ‘SimpleDemos’, schema ‘Demo’.

I expected that due to security chaining that we would not see the data masked, since the user’s rights to use the table doesn’t come into play (especially since I revovked them). If you want to elevate the user’s rights in the procedure you can use the WITH EXECUTE AS clause of the CREATE/ALTER PROCEDURE statement, such as:

ALTER PROCEDURE Demo.Person$select
WITH EXECUTE AS ‘BigHat’ –More typically this is AS OWNER, but using a more limited user is generally a better practice
AS
    SELECT PersonId, FirstName, LastName, PersonNumber, cast(PersonNumber as varchar(10)) as PersonNumber,
                                          YachtCount, cast(YachtCount as tinyint) as YachtCount
    FROM   Person
    WHERE YachtCount = 1;

and now you see the data as it exists in the table:

EXECUTE AS User=’SmallHat’
GO
EXEC Demo.Person$select;
GO
REVERT;

PersonId    FirstName  LastName   PersonNumber PersonNumber YachtCount YachtCount
———– ———- ———- ———— ———— ———- ———-
2           Barney     Rubble     0000000032   0000000032   1          1

Next, let’s use this table in a view. Since a view is a more lightweight container than a stored procedure that usually optimized as the code in the view at runtime, (and what happened with the stored procedure), I did expect this:

CREATE VIEW Demo.Person_View
AS
    SELECT PersonId, FirstName, LastName, PersonNumber,
           StatusCode, EmailAddress, InceptionTime, YachtCount
    FROM   Demo.Person;
GO
GRANT SELECT ON Demo.Person_View TO SmallHat;

Then execute it:

EXECUTE AS User=’SmallHat’;
GO
SELECT *
FROM   Demo.Person_View;
GO
REVERT;

PersonId    FirstName  LastName   PersonNumber StatusCode EmailAddress                             InceptionTime YachtCount
———– ———- ———- ———— ———- —————————————- ————- ———-
1           Fred       Flintstone 1——-14   Unknown    fXXX@XXXX.com                            1900-01-01    17
2           Barney     Rubble     0——-32   Unknown    bXXX@XXXX.com                            1900-01-01    59
4           Betty      Rubble     0——-53   Unknown    bXXX@XXXX.com                            1900-01-01    13

Finally, for a level of completeness short of seeing how a masked object would work in a user defined function that was used in a DEFAULT constraint (something that I would really suggest against) I will change Person$select to use the view (and remove WITH EXECUTE AS…)

ALTER PROCEDURE Demo.Person$select
AS
    SELECT PersonId, FirstName, LastName, PersonNumber, cast(PersonNumber as varchar(10)) as PersonNumber,
                                          YachtCount, cast(YachtCount as tinyint) as YachtCount
    FROM   Demo.Person_View
    WHERE  YachtCount = 1;
go
REVOKE SELECT ON Demo.Person_View TO SmallHat;
GO

EXECUTE AS User=’SmallHat’;
GO
EXECUTE Person$select;
GO
REVERT;

PersonId    FirstName  LastName   PersonNumber PersonNumber YachtCount YachtCount
———– ———- ———- ———— ———— ———- ———-
2           Barney     Rubble     xxxx         xxxx         0          0
3           Wilma      Flintstone xxxx         xxxx         0          0

Still works as you would expect. In conclusion, on first attempt, Dynamic Data Masking is a limited, but possibly useful tool to have in some circumstances. I see its use cases generally limited to reporting type databases where there is some data that is generally not needed for most users, but is useful to have. The primary limitation is that you can access the data using a WHERE clause so it would certainly not be appropriate for data like salary amount where a user could build a rudimentary sorting algorithm to determine all of the salaries in your database.

Load comments

About the author

Louis Davidson

Simple Talk Editor

See Profile

Louis is the editor of this Simple-Talk website. Prior to that, has was a corporate database developer and data architect for a non-profit organization for 25 years! Louis has been a Microsoft MVP since 2004, and is the author of a series of SQL Server Database Design books, most recently Pro SQL Server Relational Database Design and Implementation.